Abstract: Aspergillus flavus and A. parasiticus infect peanut seeds and produce aflatoxins, 
which are associated with various diseases in domestic animals and humans throughout the 
world. The most cost-effective strategy to minimize aflatoxin contamination involves the 
development of peanut cultivars that are resistant to fungal infection and/or aflatoxin 
production. To identify peanut Aspergillus-interactive and peanut Aspergillus-resistance 
genes, we carried out a large scale peanut Expressed Sequence Tag (EST) project which 
we used to construct a peanut glass slide oligonucleotide microarray. The fabricated 
microarray represents over 40% of the protein coding genes in the peanut genome. For 
expression profiling, resistant and susceptible peanut cultivars were infected with a mixture 
of A. flavus and A. parasiticus spores. The subsequent microarray analysis identified 
62 genes in resistant cultivars that were up-expressed in response to Aspergillus infection. 
In addition, we identified 22 putative Aspergillus-resistance genes that were constitutively 
up-expressed in the resistant cultivar in comparison to the susceptible cultivar. Some of 
these genes were homologous to peanut, corn, and soybean genes that were previously 
shown to confer resistance to fungal infection. This study is a first step towards a 
comprehensive genome-scale platform for developing Aspergillus-resistant peanut cultivars 
through targeted marker-assisted breeding and genetic engineering. 

Keywords: EST; microarray; gene profiling; peanut-fungus interaction; resistance genes; 
Aspergillus flavus; A. parasiticus; METAREP 



1. Introduction 

Peanut (Arachis hypogaea L.) is an important food and oil crop. Peanut contains not only a 
high percentage of oil (about 50%) but also a high-quality unsaturated fatty acid (oleic acid). 
These features confer superior oxidative stability for food products without further processing. Peanut 
oil is also low in saturated fat and rich in resveratrol, antioxidants, and other nutraceuticals, which may 
contribute to cardiovascular health. Currently, peanut is grown worldwide, predominantly in Asia, 
Africa, and North America, with about 21 million hectares under cultivation. World peanut 
production occupies an important role in the world economy with an estimated production value of 
about $35 billion. 

Research on the peanut genome is at an early stage. Major crop improvement emphasis is focused 
on using elite genetic stocks, cultural management, and disease and pest control measures to improve 
productivity and quality. Traditionally cultivar improvement has been limited by conventional 
breeding and selection strategies [1]. High throughput technologies such as whole genome and 
transcriptome sequencing and microarray analysis hold promise to greatly facilitate this process. To 
meet the needs of the peanut industry, the international research community developed the 
International Peanut Genomics Initiative to coordinate sequencing the complete peanut genome 
(http://www.peanutbioscience.com/peanutgenomeinitiative.html) [2,3]. Peanut is a polyploid organism 
with a large genome size (2.8 Gb), which makes whole genome sequencing prohibitively expensive. 
Furthermore, due to its polyploid nature, assembly, annotation, and analysis of the genome will be a 
very challenging task. Thus, alternative approaches such as Expressed Sequence Tag (EST) 
sequencing have been implemented to advance the understanding of the genome at a manageable cost. 

Several research institutes have undertaken low to middle scale peanut Expressed Sequence Tag 
(EST) projects [4-6]. As early as 2005, Luo et al. [6] released the first batch of EST sequences from 
two cultivated peanut lines, which were later used to design the first peanut microarray [7,8]. 
Subsequently, our research group at the USDA reported a total of 41,568 ESTs derived from Tifrunner 
and the breeding line GT-C20 [4,5]. Another group in Belgium generated 4847 ESTs from peanut 
mixed stages infected with the migratory peanut pod nematode [9]. A group at the University of 
Florida used suppression subtractive hybridization to identify differentially expressed ESTs from 
root-knot nematode (RKN)-challenged root tissues in nematode-resistant and -susceptible peanut 
cultivars [10]. Lately, the Shandong Academy of Agricultural Sciences, China, has started a 
large-scale EST project and has provided 17,000 ESTs [11]. 

With the increased awareness of aflatoxin contamination in peanut [2], the presence of aflatoxin in 
peanut products has become a serious food safety concern. It is a major financial concern to the peanut 
industry as more regulatory import measures take effect worldwide. Aflatoxin contamination in 
pre-harvested peanuts is caused by infection with Aspergillus species, mainly A. flavus and 
A. parasiticus. Understanding peanut-fungus interactions during the growth of both the peanut crop 
and the fungus is necessary to develop effective strategies to reduce or eliminate aflatoxin 
contamination of pre- and post-harvest peanut crop. Currently, peanut cultivars that are resistant to 
A. flavus and A. parasiticus infection are rare, and little is known about the molecular mechanisms that 
confer such resistance. 

To gain a better understanding of these mechanisms, the USDA has initiated the peanut genome 
program [2]. We recently [12] developed and tested the utility of the first large-scale peanut 
microarray, investigating gene expression in different peanut tissues such as pod, leaf, stem, root, 
and peg tissues. The study identified 108 putatively pod-specific/abundant genes [12]. Subsequently, 
as part of the U.S. Peanut Genome Initiative supported by U.S. industry and peanut growers, our group 
developed a large-scale peanut EST project [2,4,13] for the cultivated peanut and provided the genomic 
resources for use in marker development and gene discovery. Here we report the development of a 
peanut microarray based on these EST sequences as well as other publicly available peanut EST 
sequences downloaded from the dbEST database (NCBI, http://www.ncbi.nlm.nih.gov/) [14]. We 
employed this array in gene expression profiling experiments comparing a resistant peanut line with 
a susceptible line to identify candidate resistance genes that are up-expressed in response to 
Aspergillus infection. 

2. Materials and Methods 

2.1. Peanut Lines Used 

Two peanut lines (cultivars) were used in this experiment: Tifrunner and GT-C20, hereafter 
referred to as C20. "Tifrunner" (TF) is a runner market-type peanut (Arachis hypogaea L. 
subsp. hypogaea var. hypogaea) cultivar with a high level of resistance to Tomato Spotted Wilt Virus 
(TSWV), moderate resistance to early (Cercospora arachidicola) and late leaf spot (Cercosporidium 
personatum), but it is a late maturity cultivar [15]. This cultivar is considered susceptible to 
Aspergillus infection in the field. "GT-C20" is a Spanish-type breeding line and highly susceptible to 
TSWV and leaf spots but resistant to aflatoxin contamination [16]. 

2.2. Peanut Inoculation by Aspergillus during Growth 

Both resistant and susceptible peanut cultivars were subjected to infection with a mixture of 
A. flavus and A. parasiticus spores 60 days after planting (DAP). To mimic the fungal population in 
peanut fields, A. parasiticus NRRL 2999 and A. flavus NRRL 3357 were used for inoculation because 
they are the predominant fungal strains in our peanut fields. Immature peanut kernels were harvested 
30 days after inoculation. Total RNA was isolated from these immature kernels. Poly-A 
mRNA was prepared from the total RNA immediately prior to cDNA library construction. 

2.3. Expressed Sequence Tags and Sequencing 

Tissue collection, RNA isolation, cDNA library construction and sequencing were done at 
USDA-ARS, Crop Protection and Management Research Unit at Tifton, Georgia and US Horticultural 
Laboratory Genomics Research Center at Ft. Pierce, Florida. The peanut plant materials used for RNA 
extraction were grown in the field and inoculated at mid-bloom (60 DAP). Drought stress was imposed 
during the final 40 days before harvest through the use of rain-out shelters. Immature pods at the R5 
(beginning seed), R6 (full seed) and R7 (beginning maturity) stages from "GT-C20" and "Tifrunner" 
were collected, frozen in liquid nitrogen, and stored at -80 °C until RNA extraction. Leaf tissues were 
collected at 100 DAP under the natural occurrence of spotted wilt and leaf spot diseases of peanut 
genotypes, Tifrunner, GT-C20 and A13 [6,7]. Tissues were frozen in liquid nitrogen and stored at 
-80 °C until RNA extraction by Trizol extraction. Tifrunner is resistant to TSWV and leaf spots, but 
susceptible to Aspergillus flavus. GT-C20 is susceptible to TSWV and leaf spots but resistant to 
A. flavus, and A13 (NCV11 x AR4) is moderately resistant to TSWV and leaf spots, and resistant to 
A. flavus infection [17]. 

EST libraries were constructed using the pBluescript® II XR cDNA Library Construction Kit 
(Stratagene, La Jolla, California, Catalog). Briefly, directional cDNA synthesis was made by attaching 
5' EcoRI and 3' XhoI adaptors (oligo dT XhoI primer). After digestion with EcoRI and XhoI restriction 
enzymes, the cDNA inserts were ligated into the multicloning site of the pBluescript II SK (+) plasmid 
vector. The cDNAs in the pBluescript vector were sequenced using universal primers (5' T3 primer). 
Single-pass, unidirectional (5' end) sequencing was performed using an ABI 3730xl Genetic Analyzer 
(Applied Biosystems) with the ABI Prism BigDye terminator cycle sequencing kit (Foster City, CA). 
Base calling was made using Phred and Trace Tuner (Paracel, Pasadena, CA, USA). The sequencing, 
sequence cleaning, end trimming, and assembly processing were performed in the Laboratory for 
Genomics and Bioinformatics, University of Georgia. 

2.4. Oligo Microarray Design 

The printed oligonucleotide sequences and the array platform description can be found at the NCBI 
GEO database (accession GPL13178). Briefly, oligonucleotides ranging from 60 to 70 mer were 
designed at the J. Craig Venter Institute (JCVI) and synthesized by Sigma-Aldrich (Saint Louis, MO). 
The total number of oligonucleotides spotted on the microarray was 6932, which represented 6932 
peanut unigenes. They were spotted to Corning ultraGAPs glass slides with 3 replications of each 
oligonucleotide at different locations on the slide. With flip-dye hybridizations, the array platform 
generates 3 technical replications per hybridization. 

2.5. Microarray Experiment Design, Hybridization and Analysis 

Two factors were varied in the experimental design: peanut cultivars (TF and C20) and Aspergillus 
exposure. Combinations of these two factors allowed for four hybridization probe pairs for competitive 
hybridization as follows: 
• C20Y vs. TFY (GT-C20 infected vs. Tifrunner infected) 

• C20Y vs. C20N (GT-C20 infected vs. not infected) 

• TFY vs. TFN (Tifrunner infected vs. not infected) 

• C20N vs. TFN (GT-C20 not infected vs. Tifrunner not infected) 

The four samples were analyzed with four hybridizations each with a flip-dye control and three 
in-slide replicates as described (GEO records: GSM684493, GSM684512, and GSM684513). 
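The flip-dye design described above pairs each hybridization with a dye-swapped control. The text does not state how the two orientations were combined, so the following Python sketch shows one common convention purely as an assumption: the log2 ratios from the dye-swapped slide are negated so that both slides point in the same direction, and all in-slide replicate values are then averaged. The function name and the replicate values are hypothetical.

```python
from statistics import mean


def combine_dye_swap(forward, reverse):
    """Average replicate log2 ratios from a forward and a flip-dye slide.

    forward: log2(sample/reference) values from the in-slide replicate spots.
    reverse: log2(reference/sample) values from the dye-swapped slide.
    Reversed ratios are negated before averaging (assumed convention).
    """
    values = list(forward) + [-r for r in reverse]
    return mean(values)


# Hypothetical values for one oligo: three in-slide spots per slide.
forward_spots = [2.1, 2.3, 2.2]
reverse_spots = [-1.9, -2.0, -2.1]
combined = combine_dye_swap(forward_spots, reverse_spots)
```

Averaging across the two dye orientations is typically used to reduce dye-specific bias in two-color arrays.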

2.6. Data Processing for EST and Microarray Analysis 

Sequencing trace files from the cDNA peanut library were processed following the JCVI Sanger 
pipeline, which trims off vector and adaptor sequences and removes low-quality bases. Sequences 
sharing overlapping regions of greater than 94% identity over 40 or more continuous bases were 
assembled at high stringency using the CAP3 program and Paracel Transcript Assembler [18]; version 
2.6.2, (http://www.paracel.com) [19] with modifications by the JCVI bioinformatics team. Overlaps 
based exclusively on low-complexity regions were excluded. 
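The overlap criterion stated above (greater than 94% identity over 40 or more bases) can be illustrated with a minimal Python sketch. This is not the CAP3 or Paracel implementation; it assumes the two overlapping regions have already been aligned without gaps, and the function name and example reads are hypothetical.

```python
def overlap_qualifies(aligned_a: str, aligned_b: str,
                      min_identity: float = 0.94, min_length: int = 40) -> bool:
    """Check an ungapped, pre-aligned overlap against the stated thresholds."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("overlap strings must have equal length")
    length = len(aligned_a)
    if length < min_length:
        return False  # overlap shorter than 40 bases
    matches = sum(1 for a, b in zip(aligned_a.upper(), aligned_b.upper()) if a == b)
    return matches / length > min_identity  # strictly greater than 94%


# Hypothetical 50-base overlaps.
read_a = "ACGT" * 12 + "AC"
read_b = "TG" + read_a[2:]    # 2 mismatches, 96% identity
read_c = "TGCA" + read_a[4:]  # 4 mismatches, 92% identity
```

Here `overlap_qualifies(read_a, read_b)` passes, while `read_c` fails on identity and any overlap shorter than 40 bases fails on length.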

Hybridized slides were scanned using the standard protocol (see GEO records: GSM684493, 
GSM684512, and GSM684513 for details). All calculated gene expression ratios were log2-transformed 
and analyzed using MeV (http://www.tm4.org/mev.html) [20-22]. 

A gene was considered to be expressed if it had a positive expression value associated with it. Log2 
ratios were used to measure relative changes in expression level between two growth conditions. 
Genes were considered differentially expressed if the corresponding log2 ratios were greater than 2. 
Gene Ontology (GO), enzyme classification (EC), and PFAM term enrichment analysis was 
performed using METAREP, an online annotation presentation tool developed at the JCVI 
(http://www.jcvi.org/metarep/dashboard/index) [23]. 
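As an illustration of how log2 ratios map onto expression classes, the sketch below bins a ratio using the high (|log2| > 2) and moderate (1.5 < |log2| < 2) cutoffs reported in Table 2. The function name is hypothetical, and assigning a value of exactly 2 to the moderate bin is an assumption, since the text does not say how boundary values were handled.

```python
def classify_log2_ratio(value: float, high: float = 2.0, moderate: float = 1.5) -> str:
    """Bin a log2 expression ratio using the Table 2 cutoffs."""
    if value > high:
        return "up-high"
    if value > moderate:
        return "up-mod"      # includes exactly 2.0 (assumed boundary handling)
    if value < -high:
        return "down-high"
    if value < -moderate:
        return "down-mod"    # includes exactly -2.0 (assumed boundary handling)
    return "unchanged"


# Example log2 ratios taken from the C20Y vs. TFY column of Tables 3 and 5.
labels = [classify_log2_ratio(v) for v in (3.16, 1.81, -1.51, -3.17, 0.17)]
```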

3. Results and Discussion 

3.1. Summary Classification of Expressed Sequence Tags (EST) 

A total of over 11,141 ESTs were assembled from over 100,000 Sanger reads generated in this 
study. An additional 2738 EST sequences were downloaded from the NCBI dbEST database, including 
those submitted by the Shandong Academy of Agricultural Sciences. From this dataset, 13,879 
unique ESTs (unigenes) were assembled and annotated. The average GC content of these ESTs is 
42.6%, with a minimum of 15.8% and a maximum of 72.5%. It is estimated that the 2.8-Gb 
peanut genome hosts 25,000-35,000 protein-coding genes; therefore, 13,879 ESTs represent over 40% 
of these genes. A BLASTp search against the NCBI NR database showed that 1761 ESTs (12.7%) can be 
assigned a putative function based on sequence similarity to previously characterized proteins. 
However, 87.3% of the ESTs (12,118 ESTs) did not have significant hits in the database and were 
annotated as "hypothetical". Major functional categories represented in this EST set are listed in 
Table 1. The EST sequence data have been submitted to the NCBI EST database (ES702769 to 
ES724546 and ES751523 to ES768453). 
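The percentages in this paragraph follow directly from the reported counts. The short Python check below reproduces them; the variable names are illustrative, and the coverage figure assumes, as the text does, that each unigene corresponds to one protein-coding gene.

```python
unigenes = 13_879               # unique ESTs reported above
annotated = 1_761               # ESTs with a putative function
hypothetical = 12_118           # ESTs without significant database hits
gene_estimates = (25_000, 35_000)  # estimated protein-coding genes in peanut

# Genome coverage at the low and high ends of the gene-count estimate.
coverage_percent = [round(100 * unigenes / n, 1) for n in gene_estimates]

annotated_percent = round(100 * annotated / unigenes, 1)
hypothetical_percent = round(100 * hypothetical / unigenes, 1)
```

Under these assumptions the coverage ranges from about 39.7% (for 35,000 genes) to about 55.5% (for 25,000 genes), so the "over 40%" figure holds for all but the very top of the estimated range.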
Table 1. Classification of identified genes in peanut. 

Category of Genes                           Number of Genes
Hypothetical proteins                       12,118
Ribosomal protein                           131
Lipoprotein                                 91
Cupin                                       54
Ribulose bisphosphate carboxylase           36
Oleosin                                     33
Conglutin                                   32
Photosystem I and II                        29
Protease inhibitor/seed storage protein     28
Core histone                                25
Ara h 8 allergen/allergen                   25
Ubiquitin-conjugating enzyme                23
Peptidases                                  22
Epoxide hydrolase                           19
Ras family protein                          16
Glutathione S-transferase                   16
Zinc finger protein                         14
Seed maturation protein                     13
NAD/NADH dehydrogenase                      12
Membrane protein                            12
Hsp20                                       11
Peroxidase                                  10
14-3-3 protein                              10
Universal stress protein                    9
Oxidoreductase                              9
HMG (high mobility group) box               8
Protein kinase                              6
Polygalacturonase                           4
Other                                       1063



3.2. Identification of Resistant Genes to Aspergillus Infection Using Microarray Expression Data 

A 6932 gene-element oligonucleotide microarray was designed according to the 13,879 EST 
sequence information data set. Four microarray hybridizations were performed. We compared resistant 
peanut line, GT-C20, and susceptible peanut line, Tifrunner, under Aspergillus infected and 
non-infected conditions (C20Y vs. TFY; C20Y vs. C20N; C20N vs. TFN and TFY vs. TFN). The gene 
expression level is reported as log2 ratios of relative intensity. Among the 6932 genes whose RNA 
level was detected by the microarray, there were 401 genes that showed significant changes in gene 
expression level between resistant and susceptible peanut lines under infected and non-infected 
conditions. For each microarray hybridization, the numbers of up-expressed (log2 > 1.5) and 
down-expressed (log2 < -1.5) genes are summarized in Table 2. A large number of genes in the 
resistant peanut line GT-C20 were either highly or moderately up-expressed. Without Aspergillus 
infection (C20N vs. TFN), 9 and 31 genes in GT-C20 scored as highly and moderately up-expressed, 
respectively, compared with the susceptible line Tifrunner. With Aspergillus infection, 25 and 40 
genes were highly and moderately up-expressed, respectively, compared with the same line without 
infection (C20Y vs. C20N). More interestingly, the resistant line GT-C20 demonstrated a greater 
response to Aspergillus infection than the susceptible line Tifrunner (C20Y vs. TFY): 52 genes were 
highly and 126 moderately up-expressed. In contrast, the susceptible line Tifrunner showed almost 
no response to Aspergillus infection (TFY vs. TFN): under challenge by Aspergillus species, only one 
gene showed moderate up-expression and four genes showed moderate down-expression. 



Table 2. Statistics of differentially expressed genes among 6932 expressed genes in peanut 
as detected by microarray. 

                  Differential Expression
Hybridizations    Up-high       Up-mod              Down-high      Down-mod
                  (log2 > 2)    (1.5 < log2 < 2)    (log2 < -2)    (-2 < log2 < -1.5)
C20Y vs. TFY      52            126                 51             99
C20Y vs. C20N     25            40                  9              38
C20N vs. TFN      9             31                  3              19
TFY vs. TFN       0             1                   0              4



Table 3 shows the 62 genes, among the 178 up-expressed genes in columns 1 and 2 of Table 2 
(52 up-high and 126 up-mod), that were consistently highly up-expressed in response to 
Aspergillus infection in GT-C20 across two experiments (C20Y vs. TFY; C20Y vs. C20N) with 
expression levels significantly elevated (log2 > 1.5). Under non-infected conditions, their expression 
levels were about the same as in the susceptible line (C20N vs. TFN) (Table 3). Unfortunately, among the 
62 genes with elevated expression, only 8 were assigned biological functions based on their 
homologies to the corresponding genes in the database. The remaining 54 genes were classified as 
hypothetical proteins with no homologs in the existing database. From the consolidated data, we 
identified 22 genes in the resistant line (GT-C20) that were constitutively up-expressed compared with 
the susceptible line (Tifrunner) under infected (Table 4, C20Y vs. TFY, log2 values > 1.5) and 
non-infected conditions (Table 4, C20N vs. TFN, log2 values > 1.0). Among the 22 genes, 5 genes 
showed slight up-expression in response to Aspergillus infection compared with non-infected 
conditions (C20Y vs. C20N). Table 5 lists 42 genes in the resistant line GT-C20 that were consistently 
highly down-expressed in response to Aspergillus infection. Table 6 lists 24 genes in the resistant line 
GT-C20 that were constitutively down-expressed in the absence of infection or slightly down-expressed 
in response to Aspergillus infection (C20Y vs. TFY). 
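The selection rule for constitutively up-expressed genes stated above (C20Y vs. TFY log2 > 1.5 and C20N vs. TFN log2 > 1.0) can be sketched as a simple filter. The class and function names below are hypothetical; the four example rows are copied from Tables 3 and 4, and the code is only an illustration of the stated cutoffs, not the authors' actual pipeline.

```python
from typing import NamedTuple


class GeneRatios(NamedTuple):
    oligo: str
    c20y_vs_tfy: float
    c20y_vs_c20n: float
    c20n_vs_tfn: float
    tfy_vs_tfn: float


def is_constitutively_up(g: GeneRatios,
                         infected_min: float = 1.5,
                         uninfected_min: float = 1.0) -> bool:
    """Up in GT-C20 relative to Tifrunner both with and without infection."""
    return g.c20y_vs_tfy > infected_min and g.c20n_vs_tfn > uninfected_min


genes = [
    GeneRatios("AH000555", 2.33, 0.16, 2.30, 0.05),    # Table 4 (Cupin)
    GeneRatios("AH004017", 2.47, 2.22, 1.57, -0.05),   # Table 4
    GeneRatios("AH003951", 2.54, 2.42, -0.27, -0.22),  # Table 3 (infection-induced)
    GeneRatios("AH000387", 3.16, 1.43, 0.17, -0.53),   # Table 3 (infection-induced)
]
constitutive = [g.oligo for g in genes if is_constitutively_up(g)]
```

With these rows, only the two Table 4 entries pass the filter; the Table 3 entries are elevated under infection but not in the non-infected comparison.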
Table 3. Peanut genes consistently highly expressed in response to fungal infection. 

Oligo Name  Locus IDs              Annotation (common name)            C20Y vs. TFY  C20Y vs. C20N  C20N vs. TFN  TFY vs. TFN
AH000387    C20L-061_A09.ab1       hypothetical protein                 3.16          1.43           0.17         -0.53
AH001961    gi|134038849           hypothetical protein                 2.77          2.29           0.20         -0.56
AH001521    CL3249Contig1          hypothetical protein                 2.72          1.84          -0.04         -0.07
AH000746    C20L-034_H09.ab1       hypothetical protein                 2.57          2.31          -0.45         -0.94
AH003951    CL1062Contig1          Cupin II Oxalate oxidase             2.54          2.42          -0.27         -0.22
AH006882    CL1197Contig1          hypothetical protein                 2.46          2.32          -0.80         -0.78
AH005123    CL2491Contig1          hypothetical protein                 2.44          2.27          -0.84         -0.75
AH007217    SCL1Contig27           hypothetical protein                 2.43          1.96           0.43         -1.05
AH005015    gi|56552992            hypothetical protein                 2.43          2.49          -0.63         -0.85
AH000635    CL647Contig1           hypothetical protein                 2.39          2.38          -0.19         -0.50
AH002603    CL129Contig1           PA domain II Cucumisin               2.38          2.26          -0.90         -1.11
AH002125    gi|116359805           hypothetical protein                 2.36          1.11           0.89         -1.13
AH003731    CL533Contig1           hypothetical protein                 2.36          1.65          -0.64         -1.26
AH004262    CL497Contig2           hypothetical protein                 2.35          2.17           0.75         -0.38
AH002955    CL2001Contig1          hypothetical protein                 2.35          2.37          -0.10         -0.43
AH001895    CL282Contig1           hypothetical protein                 2.34          1.23           0.83         -0.57
AH002758    C20L-064_H03.ab1       hypothetical protein                 2.33          2.44          -1.21         -0.47
AH004570    CL3695Contig1          hypothetical protein                 2.30          2.43           0.37          0.42
AH006890    CL1262Contig1          SCP-like extracellular protein       2.29          2.00          -0.75         -1.19
AH002029    CL2422Contig1          hypothetical protein                 2.28          1.12           0.02         -0.91
AH001450    CL3051Contig1          hypothetical protein                 2.24          1.08           0.37         -0.25
AH006106    CL1112Contig1          hypothetical protein                 2.22          1.97          -0.13         -0.24
AH002617    gi|134037331           hypothetical protein                 2.20          1.25           0.66         -0.53
AH002238    CL3Contig8             hypothetical protein                 2.18          2.19          -0.70         -0.08
AH003578    CL1337Contig1          proline-rich protein                 2.16          2.01          -1.09         -0.09
AH002427    CL516Contig1           trypsin protein inhibitor 1          2.16          1.27           0.17         -0.45
AH003300    C20L-075_C10.ab1       hypothetical protein                 2.13          1.45            NaN         -1.10
AH000527    gi|149223227           hypothetical protein                 2.12          1.42           0.11          0.18
AH007516    gi|149651508           hypothetical protein                 2.09          2.09          -0.93         -0.68
AH000622    CL818Contig1           hypothetical protein                 2.08          1.91          -0.60         -0.51
AH004225    CL2026Contig1          hypothetical protein                 2.08          2.15          -0.11         -0.31
AH003174    gi|115597367           hypothetical protein                 2.05          1.45           0.47         -0.39
AH001408    CL1472Contig1          hypothetical protein                 2.04          1.55          -0.11         -0.47
AH005389    gi|116488586           hypothetical protein                 2.02          1.45           0.56         -0.74
AH006270    CL433Contig1           Protease inhibitor                   2.02          1.78           0.12         -0.03
AH001007    CL1820Contig1          hypothetical protein                 2.01          1.58          -0.49         -0.55
AH007275    gi|149648362           hypothetical protein                 1.91          2.26          -1.04          0.03
AH000272    gi|149213703           hypothetical protein                 1.91          1.03           0.07         -0.37
AH006484    CL2432Contig1          hypothetical protein                 1.81          2.10           0.13          0.15
AH001135    CL2410Contig1          hypothetical protein                 1.81          1.31          -0.41         -0.44
AH002728    gi|110813735           hypothetical protein                 1.80          1.50           0.04         -0.15
AH003280    CL22Contig1            hypothetical protein                 1.79          1.88          -0.40         -0.65
AH000855    CL3205Contig1          hypothetical protein                 1.79          1.26           0.58         -0.58
AH004448    C20L-008-1-T3_A01.ab1  hypothetical protein                 1.78          1.48          -0.25         -0.28
AH000271    CL6Contig3             hypothetical protein                 1.76          1.89           0.00         -0.32
AH003639    CL919Contig1           hypothetical protein                 1.76          1.02           0.23         -0.25
AH004090    CL1382Contig2          hypothetical protein                 1.76          1.40           0.66          0.19
AH003402    gi|116489695           BURP domain                          1.76          1.63          -0.18         -0.43
AH001502    CL44Contig1            annexin                              1.74          1.09           0.47         -0.54
AH007479    gi|116489554           hypothetical protein                 1.74          1.76          -0.09         -0.49
AH004337    CL888Contig1           hypothetical protein                 1.74          1.31           0.00         -0.53
AH008308    gi|134092873           hypothetical protein                 1.72          1.78          -0.25         -0.54
AH003520    CL844Contig2           hypothetical protein                 1.71          1.07          -0.63         -1.13
AH002696    CL3089Contig1          hypothetical protein                 1.71          1.48          -0.28         -0.52
AH005990    CL1886Contig1          hypothetical protein                 1.70          1.74           0.23          0.09
AH003584    CL1510Contig1          hypothetical protein                 1.69          1.05          -0.12         -0.54
AH003682    CL2193Contig1          hypothetical protein                 1.63          2.15          -1.03         -0.04
AH002301    gi|116488520           hypothetical protein                 1.62          1.75          -0.19          0.36
AH004952    CL582Contig1           hypothetical protein                 1.58          1.95          -0.94         -0.56
AH004306    gi|134092818           hypothetical protein                 1.57          1.33          -0.16         -0.51
AH005893    gi|110810489           hypothetical protein                 1.54          1.50          -0.11          0.04
AH001402    CL3953Contig1          hypothetical protein                 1.52          2.08          -0.14          0.30



Note: The values are log2 ratios. For example, C20Y vs. TFY means log2(C20Y/TFY), that is, the 
expression level (RPKM) of the resistant line GT-C20 relative to the susceptible line Tifrunner 
under the Aspergillus-infected condition (Y). Positive values are shaded red if > 2 and yellow if 
> 1.5 and < 2. Negative values are shaded green if < -2 and dark green if < -1.5 and > -2. 
This applies to Tables 4, 5, 6, and 7. 
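Because the table values are log2 ratios, the corresponding fold change is obtained by raising 2 to that value. The helper below is a minimal sketch with a hypothetical function name; the example values are the shading cutoffs from the note and the largest C20Y vs. TFY entry in Table 3.

```python
def fold_change(log2_ratio: float) -> float:
    """Convert a log2 expression ratio into a linear fold change."""
    return 2.0 ** log2_ratio


moderate_cutoff = fold_change(1.5)   # about 2.8-fold
high_cutoff = fold_change(2.0)       # 4-fold
top_table3 = fold_change(3.16)       # AH000387, C20Y vs. TFY
halved = fold_change(-1.0)           # a ratio of -1 means half the expression
```

On this scale the moderate cutoff corresponds to roughly a 2.8-fold difference and the high cutoff to a 4-fold difference.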



Table 4. Resistant genes constitutively expressed. 

Oligo Name  Locus IDs                    Annotation (common name)                                  C20Y vs. TFY  C20Y vs. C20N  C20N vs. TFN  TFY vs. TFN
AH003854    CL974Contig1                 hypothetical protein                                       2.51          0.23           1.17         -1.34
AH004017    gi|116488752|gb|EG529756.1   hypothetical protein                                       2.47          2.22           1.57         -0.05
AH003192    CL993Contig2                 27 K protein                                               2.35          1.31           1.15         -0.46
AH000555    SCL3Contig5                  Cupin                                                      2.33          0.16           2.30          0.05
AH002179    CL48Contig3                  Delta(12)-fatty acid dehydrogenase/desaturase              2.27          1.22           1.66          0.09
AH002829    CL432Contig1                 Aminocyclopropanecarboxylate oxidase                       2.24          0.26           1.28         -1.44
AH006795    CL2798Contig1                hypothetical protein                                       2.13          0.00           1.81         -0.25
AH001501    gi|147878026|gb|ES538584.1   11S seed storage globulin                                  2.06          0.39           2.06          0.18
AH007210    CL1250Contig1                hypothetical protein                                       1.99          1.07           1.26         -0.13
AH004716    gi|149650738|gb|ES761721.1   hypothetical protein                                       1.96          1.40           1.39         -0.18
AH000958    CL101Contig5                 hypothetical protein                                       1.95         -0.66           1.91         -0.32
AH002424    gi|56690332|gb|CX128235.1    NAD-specific glutamate dehydrogenase (NAD-GDH)             1.95          0.24           1.05         -0.76
AH003909    gi|110815482|gb|EE126718.1   hypothetical protein                                       1.91          0.36           1.03         -0.45
AH004786    gi|116360311|gb|EG374116.1   Lipoxygenase                                               1.90         -0.15           1.94         -0.62
AH003949    gi|110815082|gb|EE125141.1   hypothetical protein                                       1.85          0.31           1.20         -1.03
AH002737    CL121Contig1                 Lipoxygenase                                               1.82         -0.20           1.69         -0.53
AH000129    CL1899Contig1                hypothetical protein                                       1.77         -0.11           1.86         -0.03
AH002636    CL101Contig4                 Cupin                                                      1.75         -0.67           1.88         -0.22
AH006951    CL3246Contig1                cytochrome P450 monooxygenase A16 II Ent-kaurene oxidase   1.73         -0.21           1.31         -0.74
AH006650    CL1140Contig1                hypothetical protein                                       1.54         -0.46           1.07         -0.67
AH003334    gi|5726638|gb|AF172728.1|    hypothetical protein                                       1.53         -0.02           1.25         -0.18
AH000421    gi|115597155|gb|EG029503.1   hypothetical protein                                       1.51          0.24           1.17         -0.03



Table 5. Consistently down expressed genes in response to fungal infection. 

Oligo Name  Locus IDs              Annotation (common name)            C20Y vs. TFY  C20Y vs. C20N  C20N vs. TFN  TFY vs. TFN
AH000690    gi|149218418           Gamma-thionin family                -1.51         -1.71           1.12          0.43
AH002952    CL1612Contig1          hypothetical protein                -1.53         -1.29          -0.13          0.11
AH000772    gi|110815456           hypothetical protein                -1.53         -1.08           0.05          0.59
AH007452    CL3906Contig1          hypothetical protein                -1.55         -1.11           0.21          0.60
AH006152    CL3978Contig1          hypothetical protein                -1.57         -1.59           0.15           NaN
AH000203    C20L-061_H10.ab1       hypothetical protein                -1.60         -1.45          -0.26          0.20
AH000929    gi|110812895           hypothetical protein                -1.66         -1.16          -0.93         -0.03
AH006128    gi|116489533           CapLEA-2                            -1.72         -2.37           2.01          0.73
AH005650    CL45Contig1            type 4 metallothionein              -1.72         -1.97           1.58          0.90
AH000692    gi|115596393           hypothetical protein                -1.73         -1.34           0.49          0.47
AH001026    gi|110810654           ethylene-responsive-binding         -1.73         -1.57           0.11          0.42
AH008272    CL558Contig1           hypothetical protein                -1.79         -1.26           0.63          0.47
AH008014    gi|116360310           hypothetical protein                -1.83         -1.41           0.04          0.31
AH005128    gi|134092758           hypothetical protein                -1.83         -1.05           0.37          0.35
AH003617    CL2981Contig1          lipid-transfer protein 3 (LTP 3)    -1.83         -1.73           0.17          0.12
AH003051    gi|116360255           A Lea protein                       -1.87         -1.58          -0.19          0.54
AH007250    gi|110812391           hypothetical protein                -1.87         -1.02           0.16          0.29
AH005235    gi|110811079           hypothetical protein                -1.90         -1.94          -0.37           NaN
AH005769    CL45Contig2            hypothetical protein                -1.90         -1.90           1.15          0.79
AH003153    C20L-030_F12.ab1       hypothetical protein                -1.92         -2.00           0.00           NaN
AH007709    CL278Contig2           seed maturation protein             -1.94         -1.80           0.35          0.34
AH003511    CL2196Contig1          hypothetical protein                -1.95         -1.29           0.14           NaN
AH005270    gi|149655087           hypothetical protein                -1.96         -1.78          -0.02         -0.39
AH004182    CL3386Contig1          hypothetical protein                -1.96         -1.66           0.00          0.43
AH007249    C20L-073_B08.ab1       hypothetical protein                -2.01         -1.97           0.95         -0.09
AH004352    CL1037Contig1          hypothetical protein                -2.03         -1.47          -0.64          0.09
AH007522    gi|116488499           hypothetical protein                -2.07         -1.59          -0.59          0.80
AH000339    CL1492Contig1          hypothetical protein                -2.10         -1.67          -0.11          0.74
AH004223    gi|110810696           hypothetical protein                -2.11         -1.80           0.83          0.89
AH007290    CL526Contig1           protein binding                     -2.14         -1.40          -0.39          0.57
AH006565    gi|116488809           hypothetical protein                -2.15         -1.32          -0.53         -0.24
AH007623    CL117Contig3           hypothetical protein                -2.22         -1.47           0.13          0.53
AH001324    CL1130Contig1          hypothetical protein                -2.25         -1.51          -0.30          0.61
AH005317    CL117Contig1           hypothetical protein                -2.34         -1.65           0.39           NaN
AH003915    gi|110813009           hypothetical protein                -2.46         -2.11           0.61           NaN
AH003052    CL3662Contig1          hypothetical protein                -2.52         -1.26          -0.25          0.32
AH003152    CL953Contig1           hypothetical protein                -2.75         -1.96           0.00           NaN
AH008379    CL139Contig1           A Lea protein                       -2.81         -1.92          -0.27          0.67
AH001493    CL1974Contig1          hypothetical protein                -2.88         -2.45          -0.88          0.17
AH005219    CL890Contig2           seed maturation protein PM22        -2.90         -1.97          -0.50          0.88
AH006011    CL3786Contig1          hypothetical protein                -2.98         -2.32           0.00          0.83
AH008363    gi|149654533           hypothetical protein                -3.17         -2.16           0.00          1.09


Table 6. Constitutively down expressed genes. 

Oligo Name  Locus IDs              Annotation (common name)            C20Y vs. TFY  C20Y vs. C20N  C20N vs. TFN  TFY vs. TFN
AH005691    gi|30419827            hypothetical protein                -1.50          0.22          -1.78         -0.91
AH005600    CL3445Contig1          hypothetical protein                -1.52          0.15          -1.05          0.50
AH004676    CL3110Contig1          hypothetical protein                -1.53         -0.01          -1.03         -0.10
AH004895    CL633Contig1           hypothetical protein                -1.53         -0.47          -1.03         -0.40
AH000806    gi|134037244           hypothetical protein                -1.54         -0.51          -1.05         -0.34
AH008099    gi|149653228           hypothetical protein                -1.59          0.33          -1.34          0.00
AH000790    gi|134037353           hypothetical protein                -1.73         -0.47          -1.18          0.07
AH004328    CL633Contig2           hypothetical protein                -1.77         -0.71          -1.86         -1.08
AH003339    gi|146771647           hypothetical protein                -1.84          0.24          -1.53          0.19
AH000324    gi|110814436           hypothetical protein                -1.92         -0.49          -1.58          0.12
AH002332    C20L-052_A02.ab1       hypothetical protein                -2.02         -0.80          -1.67          0.00
AH003062    CL917Contig1           hypothetical protein                -2.08         -0.70          -1.73         -0.81
AH007482    CL3870Contig1          P-enolpyruvate carboxykinase (ATP)  -2.15         -0.26          -1.18          0.09
AH005991    gi|110811967           hypothetical protein                -2.17         -0.51          -1.27          0.00
AH003800    gi|110812127           hypothetical protein                -2.18         -1.00          -1.19          0.03
AH006388    C20L-050_C02.ab1       hypothetical protein                -2.29         -0.31          -1.58          0.16
AH005712    C20L-069_A03.ab1       hypothetical protein                -2.31         -0.38          -1.51          0.00
AH005618    CL691Contig1           hypothetical protein                -2.32          0.54          -1.47          0.00
AH006196    SCL3Contig23           hypothetical protein                -2.38         -0.51          -1.73          0.08
AH007531    C20L-055_H09.ab1       hypothetical protein                -2.38         -0.36          -1.45          0.00
AH002687    gi|72201444            hypothetical protein                -2.60         -0.91          -1.77         -0.16
AH000852    CL3980Contig1          hypothetical protein                -2.64         -0.38          -1.62          0.37
AH007569    CL2822Contig1          hypothetical protein                -2.74         -0.15          -1.68          0.23
AH006190    CL488Contig3           Thiosulfate sulfurtransferase       -3.08         -0.37          -2.47          0.00



Table 7. GO biological processes of differentially expressed genes related to resistance.

Oligo Name  Locus IDs         Common Name                                     C20Y vs. TFY  C20Y vs. C20N  C20N vs. TFN  TFY vs. TFN
AH003951    CL1062Contig1     Cupin II Oxalate oxidase                        2.54          2.42           -0.27         -0.22
AH006890    CL1262Contig1     SCP-like extracellular protein                  2.29          2.00           -0.75         -1.19
AH002179    CL48Contig3       Fatty acid desaturase                           2.27          1.22           1.66          0.09
AH002829    CL432Contig1      Aminocyclopropanecarboxylate oxidase            2.24          0.26           1.28          -1.44
AH003578    CL1337Contig1     proline-rich protein                            2.16          2.01           -1.09         -0.09
AH002427    CL516Contig1      trypsin protein inhibitor 1                     2.16          1.27           0.17          -0.45
AH006270    CL433Contig1      Protease inhibitor/seed storage/LTP family      2.02          1.78           0.12          -0.03
AH004786    gi|116360311      Lipoxygenase                                    1.90          -0.15          1.94          -0.62
AH001480    CL3357Contig1     polygalacturonase                               1.89          1.20           0.11          -0.70
AH002965    gi|146771622      gibberellin regulated protein                   1.83          1.48           0.16          -0.16
AH002737    CL121Contig1      Lipoxygenase II Lipoxygenase                    1.82          -0.20          1.69          -0.53
AH001196    gi|115597159      Caffeate or O-diphenol-O-methyl transferase     1.80          0.53           0.34          -0.79
AH006951    CL3246Contig1     P450 monooxygenase A 16 II Ent-kaurene oxidase  1.73          -0.21          1.31          -0.74
AH004519    CL3199Contig1     Protease inhibitor/seed storage/LTP family      1.35          1.83           -0.49         0.12
AH002451    CL2579Contig1     lea protein 2                                   -0.78         -1.82          1.52          0.36
AH007902    gi|149650530      SCP-like extracellular protein                  -1.60         0.71           0.00          -0.94
AH006128    gi|116489533      late embryogenesis abundant protein 2           -1.72         -2.37          2.01          0.73
AH005650    CL45Contig1       type 4 metallothionein                          -1.72         -1.97          1.58          0.90
AH001026    gi|110810654      ethylene-responsive element-binding protein     -1.73         -1.57          0.11          0.42
AH003617    CL2981Contig1     Non-specific lipid-transfer protein 3 (LTP 3)   -1.83         -1.73          0.17          0.12
AH003051    gi|116360255      A Lea protein                                   -1.87         -1.58          -0.19         0.54
AH004842    CL2270Contig1     auxin-responsive protein-related                -1.88         -0.88          0.00          0.15
AH000878    gi|110813054      glyoxalase family protein                       -2.18         -1.35          -0.52         0.60
AH004582    gi|116488861      glutathione S-transferase                       -2.28         -1.30          0.23          0.49
AH008379    CL139Contig1      A Lea protein                                   -2.81         -1.92          -0.27         0.67



3.3. Genes Conferring Resistance to Fungal Infection in Other Crop Systems Were Identified



Among the genes with postulated biological functions, we identified several that have been
reported to confer resistance to Aspergillus infection in other crop systems
(Table 7). The trypsin protein inhibitor 1 (CL516Contig1) has been shown to confer resistance to
A. flavus infection in corn [24-27]. Lipoxygenase (CL121Contig1) has also shown antifungal activity in
peanut, corn, and soybean [28-31]. Several lines of evidence indicate that lipoxygenase
enzymes and their products, especially 9S- and 13S-hydroperoxy fatty acids, could play a role in the
Aspergillus/seed interaction. Both hydroperoxides exhibit sporogenic effects on Aspergillus spp. and
differentially modulate aflatoxin pathway gene transcription.

Previous gene cloning and characterization studies [28-32] examined the role of seed
lipoxygenases through a peanut seed gene, PnLOX1. Analysis of its nucleotide sequence suggests that
PnLOX1 encodes a predicted 98 kDa protein highly similar in sequence and biochemical properties to
soybean LOX2. The full-length PnLOX1 cDNA was subcloned into an expression vector to determine
the type(s) of hydroperoxide products that the enzyme produces. Analysis of the oxidation products
of PnLOX1 revealed that it produced a mixture of 30% 9S-HPODE (9S-hydroperoxy-10E,
12Z-octadecadienoic acid) and 70% 13S-HPODE (13S-hydroperoxy-9Z, 11E-octadecadienoic acid) at
pH 7. PnLOX1 is an organ-specific gene that is constitutively expressed in immature cotyledons but
is highly induced by methyl jasmonate, wounding, and Aspergillus infection in mature cotyledons.
Examination of HPODE production in infected cotyledons suggests that PnLOX1 expression may lead to
an increase in 9S-HPODE in the seed [28-32]. Human lipoxygenase was also reported to degrade
aflatoxin B1 by oxidative metabolism [33,34]. Genes shown to confer resistance to fungal
infection in other crops, such as corn and soybean, were thus also identified in peanut through this
microarray gene profiling experiment. This result indicates that our data are consistent with previous
studies in other crops and that this study provides new evidence for the roles of these proteins in
protection against fungal infection.

3.4. Defense-Related Genes Identified by Peanut Seed EST Database Search 

The EST sequences from "GT-C20" and "Tifrunner" were compared individually against the peanut
seed EST database. Among the EST sequences with R > 4 [35], only three up-regulated putative
defense-related genes were identified in both the "GT-C20" and "Tifrunner" seed libraries. They were
putative desiccation-related protein PCC13-62 precursor, serine protease inhibitor, and seed maturation
protein LEA 4. Six up-regulated EST sequences were observed only in the "GT-C20" seed EST libraries
and matched previously reported proteins, including PR 10 protein, defensin protein, and
calmodulin. In the "Tifrunner" seed EST libraries, five defense-related genes, such as metallothionein-like
protein, heat shock protein, and Cu/Zn superoxide dismutase II, exhibited significant up-regulation.
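
The source cites [35] for the R > 4 cutoff without defining R. A minimal sketch follows, assuming that R is the log-likelihood ratio statistic commonly used to compare EST abundance across cDNA libraries, R = sum_i x_i ln(x_i / (N_i f)), where x_i is the EST count for a gene in library i, N_i is the size of library i, and f is the pooled frequency of the gene. The function name and the example counts are hypothetical and are not data from this study.

```python
import math


def est_r_statistic(counts, library_sizes):
    """Log-likelihood ratio R for one gene across several EST libraries.

    Assumed form (not confirmed by the source text):
        R = sum_i x_i * ln(x_i / (N_i * f)),  f = sum_i x_i / sum_i N_i
    Libraries with a zero count contribute nothing to the sum.
    """
    if len(counts) != len(library_sizes):
        raise ValueError("counts and library_sizes must have the same length")
    total = sum(counts)
    if total == 0:
        return 0.0
    pooled_freq = total / sum(library_sizes)
    r = 0.0
    for x, n in zip(counts, library_sizes):
        if x > 0:
            r += x * math.log(x / (n * pooled_freq))
    return r


# Hypothetical counts for one gene in two libraries of 1000 ESTs each.
even = est_r_statistic([10, 10], [1000, 1000])    # evenly represented gene
skewed = est_r_statistic([20, 0], [1000, 1000])   # gene seen in one library only
print(even, skewed, skewed > 4)
```

Under this assumption, a gene represented evenly across libraries scores R = 0, and the score grows as the counts become more skewed toward one library, which is why a cutoff such as R > 4 selects differentially represented ESTs.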

In the microarray experiments, several late embryogenesis abundant (LEA) proteins
(CL2579Contig1 and CL139Contig1) were highly or moderately
down-expressed during fungal infection. Genes for ethylene-responsive and auxin-responsive
proteins (CL2270Contig1) were also down-expressed upon fungal infection. It is interesting
that one of the SCP-like extracellular proteins was up-expressed (CL1262Contig1) while the other
SCP-like extracellular protein was down-expressed (ES761513.1|ES761513). The mechanisms of
their expression in response to fungal infection deserve further investigation.
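
The contrast between the two SCP-like proteins can be read directly from Table 7. The sketch below labels a few Table 7 rows as up, down, or unchanged in two contrasts. The values are copied from Table 7; the cutoff of 1.0 and the function names are hypothetical choices made for illustration and are not the selection criteria used in the study.

```python
# Values copied from Table 7, in column order:
# (C20Y vs. TFY, C20Y vs. C20N, C20N vs. TFN, TFY vs. TFN)
TABLE7_ROWS = {
    "CL1262Contig1": (2.29, 2.00, -0.75, -1.19),   # SCP-like extracellular protein
    "gi|149650530": (-1.60, 0.71, 0.00, -0.94),    # SCP-like extracellular protein
    "CL2579Contig1": (-0.78, -1.82, 1.52, 0.36),   # lea protein 2
    "CL139Contig1": (-2.81, -1.92, -0.27, 0.67),   # A Lea protein
    "CL2270Contig1": (-1.88, -0.88, 0.00, 0.15),   # auxin-responsive protein-related
}

CUTOFF = 1.0  # hypothetical threshold, not taken from the paper


def call_direction(value, cutoff=CUTOFF):
    """Label one ratio as 'up', 'down', or 'unchanged' relative to the cutoff."""
    if value >= cutoff:
        return "up"
    if value <= -cutoff:
        return "down"
    return "unchanged"


def summarize(rows):
    """Return direction calls for the first two contrasts of each row."""
    return {
        locus: {
            "C20Y_vs_TFY": call_direction(values[0]),
            "C20Y_vs_C20N": call_direction(values[1]),
        }
        for locus, values in rows.items()
    }


for locus, calls in summarize(TABLE7_ROWS).items():
    print(locus, calls)
```

With this illustrative cutoff, CL1262Contig1 is called up in both contrasts, whereas gi|149650530 is called down only in the C20Y vs. TFY contrast, matching the opposite behavior of the two SCP-like proteins noted above.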

4. Conclusions 

We described the sequencing and assembly of 13,879 unique peanut ESTs, designed and constructed
a 6932 gene-element oligonucleotide microarray, and used it to screen for
peanut resistance genes expressed in response to Aspergillus infection. More importantly, we identified resistance
genes that are highly expressed in response to fungal infection. These genes could be valuable
resources for follow-on research to transfer genes into commercial peanut cultivars through
conventional breeding, marker-assisted breeding, or gene transfer by biotechnology. In
addition, genetic regulation may be employed to boost the expression levels of these genes in
commercial cultivars to reduce or prevent aflatoxin contamination in the peanut crop. EST and microarray
technology has proven robust for screening and identifying resistance or susceptibility genes
on a large scale, if not at the genome scale. Because a whole-genome sequence for peanut is not yet available,
the majority of the ESTs encode hypothetical proteins with unknown functions. We demonstrate here
that using EST sequences and microarrays to screen and profile gene expression provides a
robust approach for identifying resistance genes and resistance gene candidates in the absence of a
peanut genome sequence. The data presented in this report identify gene targets for future
crop improvement. Both the research methods and the resulting data should prove useful in
crop improvement and the prevention of aflatoxin contamination.